Charting the evolutionary path of the SUMO modification system in plants reveals molecular hardwiring of development to stress adaptation

Abstract
SUMO modification is part of the spectrum of Ubiquitin-like (UBL) systems that give rise to proteoform complexity through post-translational modifications (PTMs). Proteoforms are essential modifiers of cell signaling for plant adaptation to changing environments. Exploration of the evolutionary emergence of UBL systems unveils their origin from prokaryotes, where they are linked to the mechanisms that enable sulfur uptake into biomolecules. We explore the emergence of the SUMO machinery across the plant lineage from single-cell to land plants. We reveal the evolutionary point at which plants acquired the ability to form SUMO chains through the emergence of SUMO E4 ligases, hinting at a role for chain formation in facilitating multicellularity. Additionally, we explore the possible mechanism for the neofunctionalization of SUMO proteases through the fusion of conserved catalytic domains with divergent sequences. We highlight the pivotal role of SUMO proteases in plant development and adaptation, offering new insights into target specificity mechanisms of SUMO modification during plant evolution. Correlating the emergence of adaptive traits in the plant lineage with established experimental evidence for SUMO in developmental processes, we propose that SUMO modification has evolved to link developmental processes to adaptive functions in land plants.


Introduction
Variation at the protein level (proteoforms) plays a key role in orchestrating biological complexity (Kosova et al. 2021). Posttranslational modification (PTM) events create distinct proteoforms and modulate almost every biological process. This is particularly evident in multicellular organisms, where development requires the integration of signals controlling and coordinating cell fates (switching from division to differentiation). PTMs act at the core of every cellular decision. For example, progression through the eukaryotic cell cycle is regulated by the dynamic interplay of kinases and phosphatases that regulate phosphorylation events controlling a wide range of protein functions from subcellular location to enzyme activity (Ardito et al. 2017). There are >100 different PTMs that modulate every cellular process from transcription to protein stability and function (Vu et al. 2018; Perrar et al. 2019; Kosova et al. 2021).
The ubiquitylation system was one of the first protein conjugation systems to be discovered (Miura and Hasegawa 2010; Callis 2014; Linden and Callis 2020). Ubiquitin, a 76-amino acid protein, is linked via its C-terminal glycine to lysine residues of substrate proteins through an isopeptide bond (Fig. 1A). The ubiquitin conjugation system, a cascade of enzymes sequentially named E1, E2, E3, and E4, enables this modification. By attaching ubiquitin molecules, proteins can be marked for degradation, localized to specific cellular compartments, or modulated in their activity and interactions (Callis 2014). The ubiquitylation system is involved in fundamental cellular processes, including cell cycle regulation, DNA repair, and protein quality control. In both the plant and animal kingdoms, a large repertoire of proteins has been found to be conjugated to ubiquitin. Further studies of the Ubiquitin system led to the discovery of several other Ubiquitin-like (Ubl) systems with similar conjugating machinery. SUMOylation, NEDDylation, UFMylation, and ISGylation are some of the prominent Ubl conjugation systems (Fig. 1; Fig. 2A) (Hochstrasser 2009; Vierstra 2012; Mergner and Schwechheimer 2014; Villarroya-Beltri et al. 2017; Li et al. 2023).
SUMOylation, acting through lysine residues on target proteins (summarized in Fig. 2B), has been firmly established as a vital PTM that affects almost every cellular process studied so far (Benlloch and Lois 2018; Morrell and Sadanandom 2019), but the pivotal importance and regulation of the SUMO system in plants is just beginning to be discovered (Vierstra 2012; Augustine and Vierstra 2018; Benlloch and Lois 2018; Morrell and Sadanandom 2019). SUMOylation involves the attachment of Small Ubiquitin-like Modifier (SUMO) proteins to target proteins, mediated by SUMO conjugation machinery comprising the SUMO E1 activating enzyme, SUMO E2 conjugating enzyme, and SUMO E3 ligase. Additionally, SUMO proteases process precursor SUMO and remove SUMO from substrates, regulating various developmental and cellular processes (Fig. 2B). SUMO was first discovered as a protein conjugated to Ran-GTPase activating protein 1 (RanGAP1) in mammals (Matunis et al. 1996). Subsequent studies have reported the presence of a SUMO modification machinery enabling this posttranslational modification on target proteins in unicellular yeast (Hochstrasser 2009). The SUMO modification system has been extensively studied in animals, including humans, where SUMOylation is established as a major modifier of cell signaling, particularly during stress responses (Yeh 2009; Bettermann et al. 2012; Flotho and Melchior 2013; Celen and Sahin 2020).
In plants, SUMO modification is emerging as a key mechanism by which complex biological processes from stress perception to chromatin changes are orchestrated at the cellular level (Novatchkova et al. 2012; Vu et al. 2018; Morrell and Sadanandom 2019; Celen and Sahin 2020). SUMO conjugation has been shown to be promoted by a plethora of abiotic and biotic stresses with overlapping and distinct phenotypic outputs depending on target substrates in plants (Novatchkova et al. 2012; Vu et al. 2018; Morrell and Sadanandom 2019). In Arabidopsis, SUMO conjugation predominantly involves 2 different forms: AtSUMO1 and 2. Mutants that fail to promote SUMO1/2 attachment onto target proteins display deregulated immunity and a drastic inability to cope with abiotic stresses. Recently, we and others have shown that major developmental decision processes are enabled by the SUMO system (Han et al. 2016; Gou et al. 2017; Orosa et al. 2018; Srivastava et al. 2020, 2022; Verma et al. 2021). These data underline the importance of SUMOylation in diverse processes that govern plant development and adaptation to their environment. However, there is a gap in our understanding of the evolution and functional diversification of the SUMO system across the plant kingdom.
In this review, we highlight the major differences and similarities between the machineries that drive ubiquitylation and SUMOylation in plants. We explore the emergence of the SUMO machinery across the plant lineage from single cell to land plants. This has allowed us to identify potentially critical components of the SUMO system that were selected for gene expansion as plants successfully adapted to diverse environmental conditions on land. Correlating the emergence of adaptive traits in the plant lineage with established experimental evidence for SUMO in developmental processes, we propose that SUMO modification has evolved to link developmental processes to adaptive functions in land plants.

Analogy between SUMOylation and ubiquitylation and key differences
Ubiquitylation, discovered as one of the initial protein conjugation systems, laid the groundwork for understanding post-translational modifications that regulate protein function across the biological spectrum. This discovery was followed by the identification of the SUMO conjugation machinery, marking another significant advance in our comprehension of the cellular regulation of proteoforms (Matunis et al. 1996; Hochstrasser 2009; Celen and Sahin 2020). Most of the foundational research has focused on ubiquitylation, uncovering its novel mechanisms of action and regulatory roles in both the plant and animal kingdoms. Despite these advances in understanding ubiquitylation, the exploration of SUMOylation, particularly within the plant kingdom, remains in its early stages.
Protein conjugation machineries of ubiquitylation and SUMOylation share a common feature: they both require an E1-E2-E3 conjugation cascade. A Ubl protein conjugation machinery has also been reported in prokaryotes, involving the sulfur carriers ThiS (ThiaminS) and MoaD (Molybdenum cofactor biosynthesis protein D). This system consists of an E1-like enzyme that attaches sulfur moieties to the C-terminal residues of these proteins, forming thiocarboxylates. ThiS and MoaD have the characteristic β-grasp motif (Humbard et al. 2010; Maupin-Furlow 2013, 2014). The presence of a protein conjugation system based on a β-grasp motif-containing protein in prokaryotes suggests the Ubl modifier system could have evolved from simpler sulfur conjugation mechanisms found in prokaryotic predecessors, shaped by the sulfur-rich prebiotic environment.
Perhaps the most fascinating difference between ubiquitylation and SUMOylation lies in the conjugation machinery of these modification systems. Ubiquitin is encoded as polyubiquitin, with identical units arranged as tandem repeats in the genome. These chains can attach to the substrate protein via polyubiquitylation, or C-terminal hydrolases process them into single units for conjugation to substrates. In contrast, SUMO is encoded as individual units with a C-terminal extension beyond the di-glycine motif. This extension undergoes proteolytic cleavage, leaving an exposed di-glycine motif for conjugation through the E1-E2-E3 complex.
The mature SUMO or Ub protein is bound to E1 in a high-energy-driven process. The E1-activated ubiquitin/SUMO protein is subsequently transferred to an E2 conjugation enzyme, which interacts with E3 ligases to be conjugated to target substrate lysine residues (Fig. 2A). Finally, the conjugation of ubiquitin/SUMO through isopeptide formation to the substrate occurs through complex formation with the E3 ligase core (Fig. 2A). SUMO/ubiquitin can be cleaved from its substrate proteins via isopeptidase activity of cysteine proteases. These proteases maintain a pool of free SUMO/ubiquitin in the cell (Sadanandom et al. 2012; Yates et al. 2016; Morrell and Sadanandom 2019). Multiple SUMO monomers can join in a tandem manner to form a polySUMO chain on a substrate by a group of E4 ligases known as PROTEIN INHIBITOR OF ACTIVATED STAT LIKE1 (PIAL1) and PIAL2 (Tomanov et al. 2014; Han et al. 2016). This is analogous to Ubiquitin E4 ligases that facilitate Ubiquitin chain formation.

Figure 2. Overview of peptide modification systems. A) In general, a battery of enzymes (E1, E2, E3, E4, ES) processes and conjugates the peptide modifier (C) to the protein substrate. The process begins with the modifier being C-terminally processed by ES proteases, exposing a terminal glycine (G) residue, and accepted by an ATP-driven enzyme complex composed of E1 and E2 proteins. Subsequently the modifier is transferred to a third enzyme, E3, or the E2 enzyme itself conjugates the modifier to the substrate protein at the Lys residue (K), forming an isopeptide bond. Another class of enzyme known as E4 may be present that links several modifier residues to the substrate (polyconjugation). Additionally, ES proteases help in cleaving off the modifier from the substrate. Ubiquitylation, Neddylation, and SUMOylation are examples of this type of peptide conjugation system in eukaryotes. B) SUMO modification machinery involves a pre-SUMO matured to expose the C-terminal GG motif by Ulp type of SUMO proteases before being linked to a Cys residue of E1 (composed of SAE1 and SAE2 as a heterodimer). This is followed by transfer to a Cys residue on E2 (SCE1) and finally to the E3 ligase SIZ1/HPY2. The E2 or E3 can covalently link SUMO to the Lys residue of the substrate protein. The SUMO E4 can attach polySUMOylation chains to the substrate protein that may eventually mark the protein for ubiquitylation by STUbL. SUMO is cleaved from the substrate protein by the activity of SUMO proteases (both Ulp and DeSI type proteases). The SUMOylated substrate can interact with proteins containing SIMs (SUMO interacting motifs). Abbreviations: DeSI, DeSumoylating Isopeptidase; FUG, Fourth ULP Gene class; HPY, High Ploidy2; OTS, Overly Tolerant to Salt; PIAL, Protein Inhibitor of Activated STAT Like1; SAE, SUMO Activating Enzyme; SCE, SUMO Conjugating Enzyme; SIZ, SAP and MIZ1 domain-containing ligase1; SPF, SUMO Protease related to Fertility; ULP, Ubiquitin-like Proteases.
Like Ubiquitin, Nedd8, and ISG proteins, SUMO also has a tertiary structure consisting of the β-grasp fold (ββαβββ structure) (Fig. 1). SUMO modifiers are encoded by a small gene family. In Arabidopsis, 8 homologs of SUMO have been identified so far. Among these, SUMO1/2 are the most similar to each other, are analogous to human SUMO2/3, and are the most frequently used in SUMO conjugation, followed by SUMO3 (analogous to SUMO1 in humans). The functional importance of SUMO1/2 is reflected in the fact that the sumo1sumo2 double mutant in Arabidopsis is embryo lethal. It is worth noting that each homolog of SUMO has a unique sequence at its C-terminal extension, unlike in Ubiquitin. This divergence in the C-terminal extension of Arabidopsis SUMOs may impart selectivity for different adaptive functions.
A protein substrate can either undergo SUMOylation or interact through its SUMO Interacting Motif (SIM) with a SUMO-modified target to generate an array of differential proteoforms giving rise to a wide range of protein functionalities. The presence of multiple homologs of SUMO modifiers suggests specificity in regulating different developmental processes. The majority of SUMOylated proteins can form noncovalent interactions with proteins possessing SIMs. The SIM motif is characterized by hydrophobic residues surrounded by acidic amino acids that form a beta-sheet secondary structure (Elrouby et al. 2013). Ubiquitin can interact with Ubiquitin Interacting Motifs (UIM) on proteins (Miller et al. 2004; Gao et al. 2021); however, the presence of UIMs in proteins is not well studied in plants.
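The SIM description above (a hydrophobic core flanked by acidic residues) can be turned into a rough sequence scan. The following is a minimal sketch only: the core patterns ([VIL][VIL]x[VIL] and [VIL]x[VIL][VIL]), the flank width, the acidic-residue threshold, and the function name are all assumptions chosen for illustration. They are not taken from Elrouby et al. (2013), and any hit would be a candidate requiring experimental validation.

```python
import re

# Assumed hydrophobic cores (illustrative only): V/I/L at three of four positions.
# A lookahead is used so that overlapping candidates are all reported.
CORE = re.compile(r"(?=([VIL][VIL].[VIL]|[VIL].[VIL][VIL]))")
ACIDIC = set("DE")


def find_candidate_sims(seq, flank=4, min_acidic=2):
    """Return (start, core, n_acidic) for cores with enough acidic neighbours.

    `flank` residues on each side of the core are inspected; a core is kept
    when at least `min_acidic` of them are Asp or Glu. Both are assumptions.
    """
    seq = seq.upper()
    hits = []
    for match in CORE.finditer(seq):
        start = match.start()
        core = match.group(1)
        end = start + len(core)
        left = seq[max(0, start - flank):start]
        right = seq[end:end + flank]
        n_acidic = sum(aa in ACIDIC for aa in left + right)
        if n_acidic >= min_acidic:
            hits.append((start, core, n_acidic))
    return hits


if __name__ == "__main__":
    # Made-up toy peptide, not a real protein sequence.
    print(find_candidate_sims("GSDEDVIDLTEEDK"))
```

Positions are 0-based. Tightening or loosening `min_acidic` trades sensitivity against false positives, which is one reason such scans are best treated as a first filter.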
The E1 activating enzyme in SUMOylation is a heterodimer, the SAE (SUMO Activating Enzyme), comprising SAE1 (regulatory) and SAE2 (catalytic) subunits, while for ubiquitin the E1 is constituted by a single protein, UAE (Ubiquitin Activating Enzyme).
In Arabidopsis alone, there are 37 E2 conjugating enzymes that transfer Ubiquitin to E3 Ubiquitin Ligases or catalyze the transfer of Ubiquitin directly to their substrates (Miricescu et al. 2018). However, only 1 SUMO E2 enzyme (SUMO Conjugating Enzyme 1; SCE1) has been reported so far, which facilitates SUMO conjugation to its substrate protein or to the SUMO E3 Ligase for conjugation onto substrates. SCE1, like SAE2, is critical for the survival of the plant as its deletion is lethal (Saracco et al. 2007).
Ubiquitin E3 ligases are diverse and come in at least 1,500 different forms in Arabidopsis alone. They include HECT, RING, Kelch-type, and U-box proteins, suggesting the convergent evolution of different protein families to perform the same function of facilitating the ubiquitylation of target proteins (Sadanandom et al. 2012). Most of the plant hormone receptors are ubiquitin E3 ligases, highlighting the importance of the Ub systems in plants (Blázquez et al. 2020). On the other hand, the SUMO system to date has only 2 confirmed types of E3 ligases, SIZ1 (SAP and MIZ1 domain-containing ligase1) and MMS21/HPY2 (from here on referred to as HPY2), that facilitate the attachment of SUMO chains to substrate proteins (Gou et al. 2017). HPY2 (High Ploidy2) SUMO E3 ligases are dispensable in Arabidopsis as siz1hpy2 double mutant plants are viable but have severely reduced growth and development (Castro et al. 2018a). In certain instances, like in yeast and animals, the E2 enzyme SCE1 alone can conjugate SUMO to its substrate protein in plants (Varejao et al. 2020; Ghimire et al. 2021). The contrasting evolutionary paths of ubiquitin and SUMO ligases underscore a fundamental regulatory divergence between the 2 systems: Ubiquitin E3 ligases have evolved through positive selection, diversifying to enable dynamic regulation of ubiquitylation, whereas SUMO E3 ligases have been conserved due to negative selection, maintaining a stable regulatory role.

SUMOylation in the context of plant terrestrialization
Over the years, researchers have made significant strides in understanding the role of SUMO modification in the responses of a handful of model plants to environmental cues. However, 2 important interconnected questions remain largely unexplored: the relevance of SUMO for crop domestication and plant breeding, and the role of SUMO in the generation of adaptive traits during evolution.
Reconstruction of the evolutionary history of a particular pathway requires the comparative analysis of key extant lineages. Thus, to investigate the participation of SUMO in the transition of plants from an aquatic to the terrestrial environment that occurred almost 500 MYA (Umen 2014; de Vries and Archibald 2018), existing species within key lineages of algae and land plants over this period were studied. Land plants (or embryophytes) comprise 2 major lineages: Bryophytes or nonvascular land plants (including mosses, liverworts, and hornworts), and Tracheophytes or vascular plants (including lycophytes, ferns, gymnosperms, and angiosperms). The last common ancestor of land plants was derived from a streptophytic alga, which also gave rise to the sister clade of zygnematophytic algae. The analysis of streptophytic algal genomes has provided critical evidence for the evolution of terrestrialization of plant species (Wodniok et al. 2011; Wang et al. 2020). A study in the unicellular photosynthetic chlorophytic alga Chlamydomonas reinhardtii (Cr) has provided indications of the minimum SUMO modification system that could have been available before plant terrestrialization. Chlamydomonas, like budding yeast, contains just 1 copy of each SAE1 and SAE2 component of the SUMO-activating enzyme and 1 copy of SIZ1 and HPY2 SUMO E3 ligases. However, in contrast to budding yeast, Chlamydomonas contains 3 homologs of SUMO modifiers, 3 homologs of SCE1, and 12 homologs of SUMO proteases.
Cr has 6 SUMO modifiers, of which only 3 are homologous to Arabidopsis SUMO1-SUMO2. The 161-bp physical separation between 2 of these SUMO modifiers suggests tandem duplications have given rise to these SUMO homologs in the Cr genome (Wang et al. 2008).
Among the SUMO proteases, 6 have the C48 peptidase domain found in ULP-type SUMO proteases, whereas 6 members belong to the DeSI group of SUMO proteases with a C97 peptidase domain (Lin et al. 2020). In the transition of C. reinhardtii cells from their optimal growth temperature of 25°C to stress-inducing 37°C, an upsurge in SUMO-conjugated proteins was observed. Intriguingly, a 30-min treatment resulted in a higher abundance of SUMO-conjugated proteins compared with a 1-h treatment, suggesting that deSUMOylation is a critical factor affecting SUMO conjugate accumulation, perhaps explaining the need for an expanded repertoire of SUMO proteases in Cr. Moreover, SUMOylation was also demonstrated to be critical for facilitating phototrophic movement in Chlamydomonas (Wang et al. 2008), providing a glimpse of an ancient role for SUMO in red and blue light signaling akin to what has been identified in Arabidopsis (Sadanandom et al. 2015; Srivastava et al. 2022). Chlamydomonas may provide a simpler system to understand how SUMOylation was recruited for light perception and signaling in phototrophic organisms.
Very little is known about SUMO and its targets in primitive land plants, i.e. Tracheophytes. The progression of organismal complexity, as plants began to colonize land, particularly in terms of growth form, is understood to be gradual. This evolutionary journey is evident from the simple unicellular structure observed in algae, such as Mesostigma, through to filamentous growth patterns observed in Klebsormidium, and culminating in more complex multicellular structures found in Chara, which include specialized structures resembling rhizoids and stems (Umen 2014). Across this diverse range of species, a remarkable conservation of gene number encoding the core SUMO conjugation machinery (E1-E3) is observed. This evidence suggests that SUMO conjugation is indispensable but not necessarily the driving force behind this evolutionary journey. However, as complexity in form arises in the plant lineage, we observe the appearance and gene expansion of specific components of SUMO modification that act beyond the initial SUMO conjugation step. We explore the evolutionary significance of this selection and how it might have contributed to the emergence of land plants.

Evolution of the SUMO system in plants
To elucidate the evolutionary trajectory of SUMO components throughout the plant kingdom, we conducted an in-depth analysis of sequence homology across several major orders spanning diverse plant lineages. The phylogenetic analysis conducted using publicly available full-genome sequences and data from the OneKP database (One Thousand Plant Transcriptomes 2019) reveals the widespread distribution of almost all components of the SUMO machinery across various plant lineages (Fig. 3A; Supplementary Fig. S2). The occurrence of machinery driving SUMO modifications spans the breadth of the plant kingdom, from unicellular photosynthetic algae to dicots and monocots. Across this spectrum, a core set of SUMO components emerges consistently, including the SUMO modifier (SUMO1), activating enzymes (SAE1 and SAE2), conjugating enzyme (SCE1), ligating enzyme (E3), and proteases (ULP1 and DESI1).
The SUMO1-type proteins most homologous to AtSUMO1 are present in all major lineages, while those most homologous to AtSUMO2 are found in ferns, Gymnosperms, and Angiosperms, suggesting a more recent emergence (Fig. 4A). It was previously suggested that SUMO3-8 emerged independently in Brassicaceae (Hammoudi et al. 2016). Arabidopsis SUMO3 emerged from SUMO2. AtSUMO6 and AtSUMO8 are tandem duplications of AtSUMO4 and AtSUMO7, respectively (Hammoudi et al. 2016). SUMO gene duplication has also occurred independently in many other plant species (the same phenomenon being observed in animal SUMO modifiers). It appears that SUMO modifiers have undergone spontaneous gene duplication events across different lineages through the evolution of land plants. This is particularly evident in the cluster of SUMO1-type modifiers, where gene duplication has occurred mostly in Bryophytes and Lycophytes, whereas SUMO2-type duplications have arisen mostly in Gymnosperms and Angiosperms. For example, there are 4 copies of SUMO1 homologs in the moss S. fallax and the lycophyte S. moellendorfii but 5 and 3 copies of SUMO2 homologs in P. taeda and G. soja, respectively.
Interestingly, the C-terminal end of SUMO, where the processing of the GG motif converts it to its active form, bears considerable diversity across the plant kingdom (Supplementary Fig. S1). Interspecific diversity can be observed within Sphagnum: S. recurvatum, S. lescurii, and S. palustre have a free GG motif (no C-terminal extension), whereas S. fallax and S. cuculata are exceptions. This suggests that certain species have evolved tighter regulation of SUMOylation by the addition of an extra processing step. However, in Equisetum hyemale and Pinus taeda, we find multiple homologs of SUMO, with and without extensions beyond the terminal diglycine. Interestingly, a variant version of the SUMO modifier, also known as SUMO-v, was reported in Zea mays to have an extensive N-terminal domain preceding the β-grasp motif but to lack a typical C-terminal GG tail. Genes encoding SUMO-v were found to be conserved across land plants (Augustine et al. 2016). Another set of tandem di-peptide repeats of β-grasp motifs, known as the DSUL protein, was found to be expressed in floral tissues of selected monocot cereal crop species like Zea mays, Brachypodium, Panicum, Sorghum, and Oryza sativa and also showed a lack of conservation of the GG motif (Augustine et al. 2016). This new finding of 2 different types of SUMO C termini within the same cell type suggests that 2 types of SUMO modification can occur: a fast "hardwired" SUMOylation response when the GG extension is absent, or a controlled process that depends on C-terminal processing by SUMO proteases as an adaptive response to environmental cues. These different modes of interaction with target proteins provide the SUMO system with greater flexibility for modifying cell signaling. The presence of extra copies of SUMO in plants may also impart a tissue-specific and/or condition-specific tailored SUMOylation response to different biotic and abiotic stresses.
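The distinction drawn above, between SUMO forms with a free GG terminus and forms carrying an extension beyond the di-glycine, lends itself to a simple sequence check. The sketch below is illustrative only: it assumes that the last GG in a sequence marks the processing site (which need not hold for every homolog, and would misassign variants lacking a canonical GG tail), and the function name and toy peptides are hypothetical.

```python
def classify_sumo_c_terminus(seq):
    """Classify a SUMO-like sequence by what follows its last di-glycine (GG).

    Assumption: the last 'GG' in the sequence is the processing site.
    Returns a dict with the class label, the predicted mature form, and the
    C-terminal extension that a SUMO protease would have to remove.
    """
    seq = seq.upper().rstrip("*")  # drop trailing stop-codon symbols if present
    idx = seq.rfind("GG")
    if idx == -1:
        return {"class": "no_GG", "mature": None, "extension": None}
    mature = seq[:idx + 2]
    extension = seq[idx + 2:]
    label = "free_GG" if not extension else "GG_with_extension"
    return {"class": label, "mature": mature, "extension": extension}


if __name__ == "__main__":
    # Made-up toy peptides, not real SUMO sequences.
    for toy in ("MSEQTGG", "MSEQTGGSVY", "MSEQT"):
        print(toy, classify_sumo_c_terminus(toy)["class"])
```

Applied to an alignment of SUMO homologs, a tally of the three labels per species would give a first view of how often the extra processing step is present.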
The SUMO Activating Enzyme 1 (SAE1) subunit of SUMO E1 activates SUMO to transfer to the E2 enzyme in a 2-step process involving adenylation and thioester bond formation to a catalytic cysteine residue (Lois and Lima 2005). In Arabidopsis alone, 2 homologs, SAE1a and SAE1b, are found that share 82% homology with each other (Supplementary Fig. S3). The vicinal Asp coordinates the Mg2+ ion in the ATP-Mg complex and plays an important role in adenylation activity (Lois and Lima 2005; Olsen et al. 2010). SUMO E1s mostly occur as single-copy genes, and the catalytic domain is well conserved across the plant kingdom (Supplementary Fig. S4). Interestingly, in some algal species, including Chlamydomonas, Mesotaenium, Klebsormidium, and Penium, we observed the absence of sequence conservation in either the Asp or the Cys residue. This suggests that photosynthetic algae possess an ancestral form of E1 enzyme that may have a different mechanism to form the thioester linkage with SUMO proteins. Deleting the SUMO E1 is lethal in eukaryotes, and the acquisition of a catalytic Asp/Cys dyad may be an evolutionary advantage in the terrestrial ecosystem.
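A similarity figure such as the 82% quoted above is often reported as percent identity over a pairwise alignment. As a minimal sketch, assuming that this is the intended measure and that the two sequences have already been aligned (gaps written as "-"), the calculation could look as follows. The choice to skip gap columns in the denominator is one convention among several and is an assumption here, as is the function name.

```python
def percent_identity(aln_a, aln_b):
    """Percent identity over columns where both aligned sequences have a residue.

    Inputs are two rows of a pairwise alignment of equal length, with '-'
    marking gaps. Gap columns are excluded (an assumed convention).
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aln_a.upper(), aln_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Made-up toy alignment, not the SAE1a/SAE1b sequences.
    print(percent_identity("MKT-AL", "MRTGAL"))
```

Because different denominators (alignment length, shorter sequence, ungapped columns) give different percentages, the convention used should be stated whenever such a value is compared across studies.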
The SAE2 (SUMO Activating Enzyme 2, catalytic subunit) contains 3 domains: a cysteine-rich E1-UbL (Ubiquitin-like) domain (C-XX-C) that binds a zinc ion, followed by a catalytic cysteine site that forms an E1-SUMO thioester bond, and a C-terminal UbL domain containing a second C-XX-C motif (Lois and Lima 2005). Sequence alignment reveals SAE2 to be part of a multigene family with several homologs present in a single organism (Supplementary Figs. S5 and S6). However, in several cases homologs across different genera in the plant kingdom show considerable sequence mismatch, particularly at their catalytic sites. This mismatch is commonly observed in algae (Chlamydomonas, Chloromonas, Mesotaenium, Chara), bryophytes (Marchantia, Sphagnum, Nothoceros), pteridophytes (Selaginella), and some modern angiosperms (Solanum, Glycine). The predicted SAE2 subunit sequences show limited conservation in 3 of their catalytic domains, which are important for E1 activity. The appearance of variant isoforms across different plant species indicates that SAE2 has undergone evolutionary divergence, potentially to acquire new functions.
The SUMO Conjugating Enzyme (SCE1) forms a thioester bond with SUMO1 before transferring it to the E3 ligase or the protein substrate (Yunus and Lima 2006). SCE1 is present mostly as a multicopy gene with a highly conserved catalytic active cysteine residue throughout the plant kingdom (Supplementary Figs. S7 and S8; Table 1).
The SIZ1 SUMO E3 protein facilitates the transfer of the SUMO molecule from E2 to the lysine residue of the protein substrate. SIZ1 contains an SP-RING domain in which a cysteine, a histidine, and 2 further cysteine residues coordinate Zn2+ in a tetrahedral configuration that is crucial for its activity (Yunus and Lima 2009). The catalytic sites that contain these amino acids are conserved across the plant kingdom, as shown by the multiple sequence alignment in Supplementary Fig. S9. The SUMO E3 Ligase, HPY2, exhibits conservation from chlorophytes to spermatophytes, whereas SIZ1 is notably absent in Chlamydomonas. However, in algae such as Spirogloea and Coccomyxa, SIZ1 does not exhibit sequence conservation at these catalytic residues, suggesting the existence of an ancestral form of SIZ1 in these early photosynthetic microorganisms. Interestingly, the other SUMO E3 ligase, HPY2, is less abundant across the plant kingdom (Supplementary Fig. S10). The sequence alignment shows conservation of cysteine and histidine residues at the catalytic tetrahedral domain. Nevertheless, as with SIZ1, we observe sequence diversity at these residues in the algal species Chlamydomonas, Ostreococcus, and Spirogloea (Supplementary Fig. S11). Interestingly, SIZ1 and HPY2 always exist as a pair across the plant lineage, which suggests independent parallel evolution in plants as they are functionally not interchangeable. The conserved nature of SUMO E3 ligases and lack of homologs suggest that SIZ1 and HPY2 have specific roles in temporal and spatial regulation of SUMO conjugation to protein substrates.

Evolution of PIAL SUMO E4 ligases
One of the major events in the SUMO system that marked the change from unicellular to multicellular plants was the emergence of SUMO E4 ligases. Arabidopsis encodes 2 SUMO E4 ligases called PROTEIN INHIBITOR OF ACTIVATED STAT LIKE1 (PIAL1) and PIAL2. These SUMO E4 ligases contain SP-RING domains, suggesting that they could also act as E3s. However, they have been demonstrated to create SUMO chains through isopeptide linkages and need SCE1 for their function (Tomanov et al. 2014). Arabidopsis mutant analysis shows that PIAL1 and 2 are required for salt and osmotic stress responses and can alter sulfur metabolism, yet the mutants grow normally under ordinary conditions (Tomanov et al. 2014). PIALs add polySUMO chains to protein substrates, which are then bound by STUbL (SUMO-Targeted Ubiquitin Ligase) proteins and marked for proteasomal degradation (Tomanov et al. 2014; Han et al. 2016). In addition to their E4 ligase activity, PIALs are also involved in transcriptional silencing through their interaction with a Morpheus' Molecule 1 (MOM1)-containing complex (Han et al. 2016).
In our phylogenetic analysis, we observed that the PIAL proteins were absent in the unicellular algal species. This is corroborated by previous reports showing a lack of PIAL proteins in C. reinhardtii (Lin et al. 2020). Algal species like Chloromonas, Trebouxia, and Chlamydomonas survive as independent single-celled organisms in the environment (Fig. 2A, Supplementary Figs. S12 and S13). However, with the emergence of multicellular algal species like Mesotaenium and Spirogloea, where the transition from unicellular to filamentous multicellular structures occurs, we find the emergence of PIALs, which enable polySUMOylation of protein substrates, perhaps followed by their degradation through STUbLs. The SUMO E4 ligase shows sporadic occurrence in organisms belonging to the chlorophytes and streptophytes, emerging more prominently in nonvascular bryophytes (Table 1). Intriguingly, SUMO conjugated to SCE1 can form SUMO chains even in the absence of PIALs, although less efficiently than in their presence (Tomanov et al. 2014).
The lack of PIALs in chlorophytic algae suggests that mono- and/or multi-SUMOylation is the main form of SUMO modification in these unicellular algae. It is tempting to speculate that the ability to form polySUMO chains may be a key feature underlying the attainment of multicellularity in the plant lineage (Fig. 3). However, we cannot rule out polySUMOylation in these algal species, as the SIZ1 or HPY2 E3 ligases may also add polySUMO chains to their protein substrates. Further studies need to be undertaken to verify the occurrence of polySUMOylation in unicellular algal species.

Evolution of SUMO proteases: providing clues for specificity in adaptation
SUMO proteases are cysteine proteases that play a major role in the deconjugation of SUMO from protein substrates. This process helps in the recycling of SUMO components back into the conjugation system and controls cellular SUMO conjugation levels. SUMO proteases also cleave the C-terminal extensions of SUMO precursors, which affects the flow of mature SUMO into the SUMOylation cycle.
Comparative genomics and phylogenetic analyses have shown that different land plant species have a wide range of SUMO proteases, which underlines their importance for cellular processes (Morrell and Sadanandom 2019). These studies have revealed intriguing patterns of gene duplication, loss, and functional diversification, highlighting the dynamic nature of SUMO proteases in land plant evolution. Additionally, the identification of conserved domains and motifs within these proteases has provided insights into their structural and functional characteristics. Based on the amino acids present in the active site, SUMO proteases can be divided into 2 classes. The first is the CE class, which mainly consists of ULPs (Ubiquitin-Like Proteases) that belong to the C48 protease family. These proteases have a catalytic triad at their active site, which consists of histidine, aspartate, and cysteine (Supplementary Fig. S14).
The second type is the CP class of enzymes, which comprises the C97 cysteine protease family. The DeSI (DeSumoylating Isopeptidase) proteases are in this category. The CP (Cysteine Protease) class of enzymes has a characteristic catalytic dyad at its active site, which consists of histidine and cysteine only (Supplementary Fig. S14). The DeSI proteases lack the pre-SUMO processing peptidase activity that gives rise to mature SUMO forms (Gillies and Hochstrasser 2012; Suh et al. 2012).
The phylogenetic analysis of the ULPs suggests that they can be broadly categorized into 4 groups (Supplementary Fig. S14). ELS (ESD4-Like SUMO protease), ESD4 (Early in Short Days 4; ULP1-like), and SPF (SUMO Protease related to Fertility; ULP2-like) are present in all lineages, suggesting these are ancient types of ULPs. FUG-type ULP proteases are a small group found only in 2 lineages, gymnosperms and angiosperms, suggesting that this group has emerged more recently (Castro et al. 2018a). The OTS type is absent in microalgae, bryophytes, and lycophytes (Fig. 2, B and D). This highlights the neofunctionalization of FUG-type and OTS-type SUMO proteases in more complex plants. Interestingly, we identified another group of ULP2-like proteases related to the SPF-type cluster (which we designate as SPF-like), which is exclusively present in bryophytes, lycophytes, and ferns (Fig. 3B, Supplementary Fig. S14). This group of proteases indicates the occurrence of a novel, independent parallel evolution of the ULP clade in early non-seed-bearing land plants.
ULP1 is among the first SUMO proteases discovered and was found to regulate the G2/M phase transition in the yeast cell cycle (Li and Hochstrasser 1999). In Arabidopsis alone, 8 homologs of ULPs have been reported to be actively involved in modulating a range of stress factors and influencing developmental processes in plants (Morrell and Sadanandom 2019). ESD4 (EARLY IN SHORT DAYS 4) and ELS1 (ESD4 Like SUMO protease) are ULP1-like proteases that have a CE domain. The catalytic triad consists of histidine, aspartate, and cysteine residues that are conserved across the plant kingdom. These proteases are found ubiquitously, from unicellular algae to complex multicellular land plants (Supplementary Fig. S15; Table 1). The critical roles of ULPs in pre-SUMO processing and isopeptidase activity for SUMO modification enable organisms to fine-tune SUMO regulation. This has led to gene expansion and the emergence of multiple homologs, perhaps facilitating SUMO-mediated adaptation to a wide range of environmental conditions (Supplementary Fig. S16).
The ELS1/ESD4 ULP-type cysteine proteases are mostly observed in land plants, indicating a role in adaptation to terrestrial environments. ELS2 and ESD4 appear in streptophytes, while ELS1 is present in chlorophyta and is common in the early vascular and nonvascular plants (Fig. 4B). The evolution of ELS1 in Chlamydomonas appears to parallel the development of Brassinosteroid (BR) signaling pathways in chlorophytes. Recent studies in liverworts showed that levels of the transcription factor BZR1/BES1, regulated by BR signaling, are linked to the facilitation of gametophyte formation (Furuya et al. 2024). ELS1 (ULP1a) regulates the SUMOylation status of Arabidopsis BZR1, a major plant transcription factor in the BR pathway. ULP1a aids BZR1 activity in signaling during salt stress to adapt plant growth (Srivastava et al. 2020). Remarkably, we observe a clear correlation between the presence of ELS1/ESD4 ULPs in the plant lineage and the evolutionary emergence of BR signaling, which is thought to have originated in single-cell plants (Fig. 5) (Ferreira-Guerra et al. 2020; Kour et al. 2021). Moreover, in response to elevated ambient temperatures, the DESI3a-mediated deSUMOylation of the BR receptor BRI1 can dampen BR-mediated plant growth (Naranjo-Arcos et al. 2023). These findings underscore the coevolutionary dynamics between SUMO proteases and phytohormones in shaping adaptive responses across plant taxa. By modulating the activity of key signaling components, such as BZR1 and BRI1, SUMO proteases regulate plant growth and stress responses in a manner that allows developmental processes to be integrated with environmental cues.
OTS1 and OTS2, known as OVERLY TOLERANT TO SALT 1 and 2, represent a separate category of ULP1 cysteine proteases (Supplementary Fig. S25). They are absent in bryophytes (Marchantia, Physcomitrium, Anthoceros, Takakia, and Sphagnum), pteridophytes (Selaginella, Equisetum, Polypodium), gymnosperms (Gnetum, Pinus, Picea), and a few angiosperms (Amborella, Solanum, Aloe, Zostera) (Supplementary Fig. S26). OTS arises from tracheophytes and undergoes several rounds of endoduplications across spermatophytes. This reveals that OTS SUMO proteases have evolved mostly in angiosperms. The presence of OTS SUMO proteases in angiosperms suggests that these proteases have played a role in the evolution and adaptation of flowering plants. OTS levels play a critical role in controlling plant responses against salt and drought stress in Arabidopsis and rice (Conti et al. 2008; Srivastava et al. 2016a, 2016b). Their absence in algae and bryophytes further highlights their significance in the development of more complex plant structures and functions (Fig. 5). This suggests a multifaceted role for OTS in orchestrating stress adaptation mechanisms in angiosperms. The occurrence of the defense hormone salicylic acid and its signaling mechanism has been elaborated in angiosperms (Monte 2023). It has been reported that OTS1 can downregulate salicylic acid levels to control plant immunity against bacterial pathogens (Bailey et al. 2016). Furthermore, OTS1 levels can also promote SUMOylation of JAZ proteins to attenuate JA signaling (Srivastava et al. 2018). This suggests that the evolution of OTS in angiosperms facilitated the regulation of defense pathways in land plants. Additionally, OTS plays an important role in controlling transcriptional gene silencing by modulating the activity of DNA Polymerase V (Liu et al. 2017a). In addition to its role in stress response, OTS has been found to regulate flowering and stamen and seed development in Arabidopsis (Campanaro et al. 2016; Srivastava et al. 2016b). This implies that the presence of OTS might have played a significant role in shaping the processes of gametophytic generation and seed development in angiosperms (Fig. 5).
The transition to terrestrial life in plants was marked by the evolution of active vascular systems, enabling long-distance nutrient transport and mechanical support (Blázquez et al. 2020; Preston et al. 2022). This adaptation was crucial for the diversification of spermatophytes, with significant variations in root cellular anatomy between angiosperms and gymnosperms (Koonin 2010; de Vries and Archibald 2018; Motte and Beeckman 2019). Lateral root development, originating from the pericycle's stem cell zone or merophyte, exhibits distinct patterns between these groups (Motte and Beeckman 2019). Lycophytes display a rigid merophyte arrangement limiting lateral root formation, whereas the flexible arrangement in angiosperms supports rapid growth and adaptability to soil conditions (Chen et al. 2016). Angiosperm trees, with their larger root diameters, enhanced branching, and efficient xylem vessels, are adapted for better water translocation compared with the tracheid-dominated gymnosperms (Sperry et al. 2006; Liese et al. 2017; Motte and Beeckman 2019). This anatomical difference underpins the angiosperms' ability to thrive in varied environmental conditions, including colder and nutrient-poor soils, through increased root lignification and nitrogen uptake (Zanne et al. 2014). Conversely, gymnosperms' lower root proliferation and branching are suited to stable environments with consistent leaf litter (Liese et al. 2017).
Overexpression of the OTS1 SUMO protease increases root length and branching, whereas its silencing reduces root growth (Conti et al. 2008; Srivastava et al. 2016a). The role of OTS1 in promoting root elongation and branching in angiosperms highlights the evolutionary significance of SUMO proteases in adapting root development to terrestrial challenges, and illustrates the link between SUMOylation and plant adaptation to new environments.
SPF1 and SPF2 (SUMO PROTEASE RELATED TO FERTILITY 1/2) belong to the ULP2 class of SUMO proteases, which have both SUMO maturation and proteolytic activity. They are completely absent in algae but are present in some classes of vascular plants including pteridophytes (Supplementary Fig. S23; Table 1). They occur as a homologous pair, SPF1 and SPF2, in Arabidopsis. The catalytic triad, comprising histidine, aspartate, and cysteine, is conserved throughout (Supplementary Fig. S24). The occurrence of SPF in land plants suggests that SPF proteases evolved in tracheophytes, playing a role in the evolution of a predominant sporophytic generation that gave rise to seed-bearing spermatophytes (gymnosperms and angiosperms). This evolutionary development may have played a crucial role in the reproductive success and survival of gymnosperms and angiosperms. SPF1, along with SPF2, is important for the maintenance of plant fertility and affects gametophyte development as well as embryo formation in Arabidopsis (Liu et al. 2017b; Castro et al. 2018b; Liu et al. 2019). SPF promotes photomorphogenesis in the presence of red light by deSUMOylation of MYC2, whereas this phenomenon is reversed under blue light (Srivastava et al. 2022). The ability of SPF to sense and respond to different wavelengths of light by altering the SUMOylation status of its substrates highlights the adaptive responses underpinned by SUMO in land plants to convert different environmental cues into developmental signals (Fig. 4).
The ULP2-type protease FUG1 (Fourth ULP Gene class 1) has been identified as having emerged relatively recently, tracing back to the Cretaceous period, coinciding with the rise of spermatophytes (Fig. 4). A recent study has elucidated FUG1's role in deSUMOylating the epigenetic gene silencer AL3 (Alfin-like family) in repeat expansion-induced epigenetic gene silencing, consequently triggering histone methylation (H3K4me) and impeding plant growth by interacting with the Polycomb repressor complex (PRC), a process akin to genomic imprinting (Sureshkumar et al. 2024). In angiosperms, the development of endosperm and seed formation critically hinges on the genetic imprinting status of seed development genes (Bauer and Fischer 2011; El-Sappah et al. 2021). The emergence of FUG1 in seed-bearing spermatophytes and its involvement in establishing methylation marks for seed development suggest an evolutionary adaptation of SUMO proteases to facilitate the molecular reconfiguration of developmental traits.
The catalytic domain of DeSI1 (comprising cysteine and histidine) is present in most plants from unicellular algae to land plants (Supplementary Fig. S17). Although DeSI1 has undergone gene expansion, the catalytic residues are conserved throughout its homologs (Supplementary Fig. S18). Interestingly, in Oryza sativa, DeSI1 is found to lack a cysteine residue at its catalytic site. This absence suggests the possibility of neofunctionalization occurring in DeSI1, possibly because of selection pressure for high-yielding crops during the process of domestication. Similar to DeSI1, DeSI2 is present as a multigene family, and its catalytic domain is conserved throughout algae, bryophytes, and tracheophytes (Supplementary Figs. S19 and S20). Intriguingly, in Polypodium hesperium, a fern, we find a DeSI2 isoform lacking the catalytic cysteine residue. This isoform could potentially be an ancestral gene in ferns from which the present-day DeSI2 has arisen. The presence of DeSI1 and DeSI2 homologs from unicellular algae to land plants suggests an important role in photoautotrophic nutrient acquisition (Fig. 3C). DeSI3A, DeSI3C, and DeSI4 also possess cysteine and histidine residues in their catalytic domain (Supplementary Figs. S21, S22, and S27). They are absent in unicellular algae but have arisen as a unigene family in certain multicellular algal species. They are also found in certain bryophytes and pteridophytes. However, they are most commonly found in modern land plants, where they have undergone gene expansion (Fig. 5). These findings suggest that DeSI3 and DeSI4 have acquired a role in the evolution and adaptation of plants to terrestrial environments. Notably, DeSI3a regulates SUMOylation of FLS2, the bacterial flagellin receptor critical for mounting a potent immune response (Orosa et al. 2018). In this context, the occurrence of DeSI3/4 along with their targets across bryophytes and pteridophytes (such as Physcomitrella and Selaginella) could reveal the identity of key immune-related processes that have allowed plants to colonize land (Fig. 5). Presently, there is a limited understanding of the potential targets of DeSI in plants, making it a focal point for future research.
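The observation that some DeSI homologs lack the catalytic cysteine can be screened for directly in a multiple sequence alignment. The following is a minimal sketch, assuming the alignment is already available as equal-length strings and that the alignment columns holding the catalytic histidine and cysteine are known; the function name, the toy sequences, and the column positions are all hypothetical and do not come from the supplementary alignments.

```python
def missing_catalytic_residues(
    alignment: dict[str, str],
    expected: dict[int, str],
) -> dict[str, list[str]]:
    """Map each sequence name to the expected catalytic residues it lacks.

    `expected` maps a 0-based alignment column to the one-letter residue
    expected there (H for histidine, C for cysteine).
    """
    report: dict[str, list[str]] = {}
    for name, seq in alignment.items():
        missing = [
            f"{residue}@{col}"
            for col, residue in sorted(expected.items())
            if col >= len(seq) or seq[col] != residue
        ]
        if missing:
            report[name] = missing
    return report


# Invented toy alignment; real input would be the aligned DeSI homologs.
toy_alignment = {
    "homolog_1": "GAHLTQCS",
    "homolog_2": "GSHLTQCS",
    "homolog_3": "GAHLTQSS",  # cysteine substituted (invented example)
}
dyad_columns = {2: "H", 6: "C"}  # hypothetical positions of the His/Cys dyad

report = missing_catalytic_residues(toy_alignment, dyad_columns)
print(report)
```

Sequences flagged in this way are only candidates: a substitution at the expected column may also reflect a local misalignment, so flagged homologs would still need manual inspection.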
In summary, the discovery of diverse SUMO components across the plant kingdom, from unicellular algae to multicellular land plants, sheds light on how these components evolved as the organisms adapted to different environmental conditions. In this context, the neofunctionalization of SUMO proteases, as opposed to SUMO E3s, within plant genomes highlights a different path taken by SUMO modification when compared with ubiquitin as plants adapted to land. The presence or absence of these proteases in various plant lineages provides insights into their role in the adaptation of plants to terrestrial environments and the development of complex structures. The findings presented in this review contribute to unravelling the intricate protein modification mechanisms that govern biological systems, laying the groundwork for future research addressing the role of SUMO and, more generally, peptide-based modification in early land plants.

Sequence identification and phylogenetic analysis methods
Sequences of SUMO machinery components in the green lineage were initially gathered with a local BlastP search in several databases, including OneKP (One Thousand Plant Transcriptomes 2019), Phytozome (Goodstein et al. 2012), and published whole-genome sequences, using an E-value cutoff of 0.1 in most cases. The first search was performed with the corresponding A. thaliana sequences, and subsequent searches were done with bryophyte and algal sequences until no new sequences were retrieved. Subsequently, the results were manually checked using SMART (http://smart.embl-heidelberg.de/) and Pfam (http://pfam.sanger.ac.uk/search) to ensure the presence of specific domains associated with the different components. A preliminary alignment and tree were generated with the OneClick/FastTree tool available through the NGPhylogeny website (Lemoine et al. 2019), and these were then used to discard non-orthologous sequences.
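The search strategy described above, in which retrieved sequences are used as new queries until nothing new is found, can be sketched as a simple fixed-point loop. This is an illustration under stated assumptions rather than the pipeline actually used: the `search` callable stands in for a BlastP run, the sequence identifiers and E-values are invented, and re-querying with every new hit is a simplification of the staged A. thaliana, bryophyte, and algal searches. Only the E-value cutoff of 0.1 is taken from the text.

```python
from typing import Callable, Iterable

Hit = tuple[str, float]  # (subject sequence ID, E-value)


def iterative_search(
    seeds: Iterable[str],
    search: Callable[[str], list[Hit]],
    evalue_cutoff: float = 0.1,
) -> set[str]:
    """Re-query with every newly found sequence until no new ones appear."""
    found: set[str] = set(seeds)
    queue: list[str] = list(found)
    while queue:
        query = queue.pop()
        for subject, evalue in search(query):
            # Keep only hits at or below the cutoff that were not seen before.
            if evalue <= evalue_cutoff and subject not in found:
                found.add(subject)
                queue.append(subject)
    return found


# Toy stand-in for a BlastP run; IDs and E-values are invented.
TOY_HITS: dict[str, list[Hit]] = {
    "At_query": [("Bryo_1", 1e-40), ("Noise_1", 5.0)],
    "Bryo_1": [("At_query", 1e-40), ("Alga_1", 1e-5)],
    "Alga_1": [("Bryo_1", 1e-5)],
}


def toy_search(query: str) -> list[Hit]:
    return TOY_HITS.get(query, [])


result = iterative_search(["At_query"], toy_search)
print(sorted(result))
```

Because a permissive cutoff admits distant and spurious hits, the candidates collected by such a loop would still go through the domain checks and preliminary tree described above before being treated as orthologs.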
The final alignments were obtained with MAFFT, followed by a combination of automatic trimming with BMGE (Criscuolo and Gribaldo 2010) and manual inspection to select the optimal region for phylogenetic inference. Phylogenetic trees were built using PhyML with Smart Model Selection (SMS) (Guindon et al. 2010). Bootstrap support was calculated with 1,000 replicates. The graphical representation of the phylogenetic trees was generated using iTOL (Letunic and Bork 2021), and the trees were rooted at the midpoint. The final figures were edited manually. The models were created using Biorender.
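For readers unfamiliar with bootstrap support, the resampling step can be sketched as follows, assuming the standard nonparametric bootstrap in which alignment columns are drawn with replacement to build each pseudo-replicate. This is an illustration of the concept only and not PhyML's implementation; the function name and the toy alignment are hypothetical, and only the number of replicates (1,000) is taken from the text.

```python
import random


def bootstrap_alignment(
    alignment: dict[str, str], rng: random.Random
) -> dict[str, str]:
    """Resample alignment columns with replacement (one pseudo-replicate)."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    # The same sampled columns are applied to every sequence, so each
    # column of the replicate is an intact column of the original alignment.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


# Invented toy alignment; real input would be the trimmed MAFFT alignment.
toy = {"seqA": "MKC-H", "seqB": "MRCDH", "seqC": "MKCDH"}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
print(len(replicates), replicates[0])
```

A tree would then be inferred from each pseudo-replicate, and the support for a branch is the fraction of replicate trees in which that branch is recovered.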

Figure 1. Protein conjugation system in prokaryotes and eukaryotes. The protein modifiers are composed of the typical ββαβββ tertiary structure common across all protein conjugation systems. Shown are the structure of Ubiquitin-related modifier-1 (URM1) protein from Saccharomyces cerevisiae, the Small Archaeal Modifier Protein-1 (SAMP1) structure from Haloferax gibbonsii, the ThiaminS (ThiS) structure in Escherichia coli K12, and the Molybdenum cofactor biosynthesis protein D (MoaD) structure in E. coli K12. However, the analogous peptide modifier, Pup, in Mycobacterium sp. (strain KMS) lacks the ββαβββ tertiary structure. In unicellular algae through land plants, this common ββαβββ tertiary structure occurs in Ubiquitin, SUMO1, and Nedd8/Rub1.

Figure 3. Occurrence of genes encoding the core SUMO system components (A) and the expanded SUMO proteases (B) across different plant lineages. Dark filled circles indicate at least 1 positive hit in a search for the gene encoding the respective component (rows) in different species (columns), confirmed by phylogenetic analysis. Light-grey filled circles indicate the presence of a plausible or distant ortholog hit. Names of the components are the ones used in Arabidopsis. Different background colors mark the major plant lineages.

Figure 4. Evolution of SUMO modifiers and proteases in plants. Putative evolutionary trajectory of (A) SUMO modifiers, (B) ULP-type SUMO proteases, and (C) DeSI proteases. The sign ">" indicates the emergence of new SUMO components from the last common ancestor. Green circles indicate the origin of new sub-families by duplication of pre-existing genes. Red crosses indicate putative gene losses in specific lineages. Abbreviations: An, angiosperms; Ch, chlorophytic algae; Fe, ferns; Gy, gymnosperms; Hw, hornworts; Lw, liverworts; Ly, lycophytes; Mo, mosses; St, streptophytic algae.

Figure 5. The chronology of the emergence and evolution of the SUMO system components reveals a synchronized progression with the development of various adaptive traits in the plant kingdom. This timeline delineates key evolutionary milestones at which specific SUMO components appeared, aligning with the emergence of new functionalities and survival strategies in plants as they adapted to changing environments and ecological niches over millions of years. The cladogram shows the timing of the branching points of the main plant lineages.